Solution Methods for Constrained Markov Decision Process with Continuous Probability Modulation

Authors

  • Marek Petrik
  • Dharmashankar Subramanian
  • Janusz Marecki
Abstract

We propose solution methods for previously unsolved constrained MDPs in which actions can continuously modify the transition probabilities within some acceptable sets. While many methods have been proposed to solve regular MDPs with large state sets, there are few practical approaches for solving constrained MDPs with large action sets. In particular, we show that the continuous action sets can be replaced by their extreme points when the rewards are linear in the modulation. We also develop a tractable optimization formulation for concave reward functions and, surprisingly, also extend it to nonconcave reward functions by using their concave envelopes. We evaluate the effectiveness of the approach on the problem of managing delinquencies in a portfolio of loans.
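The extreme-point claim in the abstract can be illustrated with a toy one-step backup. The sketch below is not the paper's algorithm: it assumes a hypothetical box-shaped modulation set, a next-state distribution p(m) = p0 + D m that is affine in the modulation m, and a reward that is linear in m. All names and numbers (q_value, best_vertex, p0, D, c, v) are made up for illustration, and the constraints of the constrained MDP are left out. Under these assumptions the one-step value is affine in m, so its maximum over the box is attained at a vertex, and enumerating the vertices is enough.

```python
import itertools
import random

def q_value(m, r0, c, p0, D, v, gamma):
    """One-step value r0 + c.m + gamma * p(m).v, which is affine in m."""
    # Modulated next-state distribution p(m) = p0 + D m (hypothetical model).
    p = [p0[j] + sum(D[j][k] * m[k] for k in range(len(m)))
         for j in range(len(p0))]
    # Reward that is linear in the modulation, as the abstract assumes.
    reward = r0 + sum(ck * mk for ck, mk in zip(c, m))
    return reward + gamma * sum(pj * vj for pj, vj in zip(p, v))

def best_vertex(bounds, **model):
    """Return the vertex of the box `bounds` with the highest one-step value."""
    return max(itertools.product(*bounds), key=lambda m: q_value(m, **model))

# Hypothetical data: three next states, two modulation dimensions.
# Each column of D sums to zero, so p(m) stays normalized.
model = dict(
    r0=1.0,
    c=[-2.0, -5.0],                # cost of applying each modulation
    p0=[0.5, 0.3, 0.2],            # baseline transition probabilities
    D=[[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0]],
    v=[10.0, 4.0, 0.0],            # value of each next state
    gamma=0.9,
)
bounds = [(0.0, 0.2), (0.0, 0.1)]  # acceptable range of each modulation

best = best_vertex(bounds, **model)
best_q = q_value(best, **model)

# Sanity check: random points inside the box never beat the best vertex.
rng = random.Random(0)
samples = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(1000)]
max_sampled_q = max(q_value(m, **model) for m in samples)
print(best, best_q, max_sampled_q <= best_q + 1e-9)
```

Enumerating vertices is only practical for low-dimensional boxes, since a box in d dimensions has 2^d vertices. The sketch is meant to show why a continuous action set can collapse to finitely many actions in the linear case, not how to solve the full problem.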

Similar resources

Capacity, mutual information, and coding for finite-state Markov channels

The Finite-State Markov Channel (FSMC) is a discrete time-varying channel whose variation is determined by a finite-state Markov process. These channels have memory due to the Markov channel variation. We obtain the FSMC capacity as a function of the conditional channel state probability. We also show that for i.i.d. channel inputs, this conditional probability converges weakly, and the channel...

Function Approximation for Continuous Constrained MDPs

In this work we apply function approximation techniques to solve continuous, constrained Markov Decision Processes (MDPs). Many real-world robot planning problems are best represented as MDPs with a continuous state space. However, in many scenarios constraints must be accounted for as well. These constraints are treated probabilistically, with a bound on the constraint violation probability. E...

The Inventory System Management under Uncertain Conditions and Time Value of Money

This study develops an inventory model to determine the ordering policy for deteriorating items with shortages under Markovian inflationary conditions. Markov processes include processes whose future behavior cannot be accurately predicted from their past behavior (except the current or present behavior) and which involve random chance or probability. Behavior of business or economy, flow of traffic, p...

Capacity, Mutual Information, and Coding for Finite-State Markov Channels - Information Theory, IEEE Transactions on

The Finite-State Markov Channel (FSMC) is a discrete time-varying channel whose variation is determined by a finite-state Markov process. These channels have memory due to the Markov channel variation. We obtain the FSMC capacity as a function of the conditional channel state probability. We also show that for i.i.d. channel inputs, this conditional probability converges weakly, and the...

Fast Simulation of Markov Fluid Models

In this paper we study continuous flow finite buffer systems with input rates modulated by Markov chains. Discrete event simulations are applied for estimating loss probabilities. The simulations are executed under a twisted version of the original probability measure (importance sampling). We present a simple rule for determining a new measure, then show that the new measure matches the 'most likely'...

Journal:
  • CoRR

Volume abs/1309.6857

Publication date 2013